Multi-task Learning with Gradient Guided Policy Specialization

Authors

  • Wenhao Yu
  • Greg Turk
  • C. Karen Liu
Abstract

We present a method for efficiently learning control policies for multiple related robotic motor skills. Our approach consists of two stages: joint training and specialization training. During joint training, a single neural network policy is trained to perform multiple tasks, forcing the policy to learn a common representation of the tasks. During specialization training, we selectively split the weights of the policy based on a per-weight metric that measures disagreement among the tasks. Splitting part of the control policy allows it to be further trained to specialize to each task. To update the control policy during learning, we use Proximal Policy Optimization. We evaluate our approach on two continuous control problems in simulation: 1) training three single-legged robots of considerably different shape and size to hop forward, and 2) training a 2D biped robot to walk forward and backward. We test our method with different numbers of joint training iterations and different specialization amounts, and compare it to a random specialization scheme and a standard specialization scheme. Finally, we design a multi-task problem in which the inter-task similarity can be continuously controlled. We show that our method helps learn multi-task problems for different types of robots and for different levels of similarity among the tasks.
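The abstract does not specify the exact per-weight disagreement metric, so the sketch below assumes a simple variance-based proxy: for each shared weight, measure how much the per-task gradient estimates disagree, then mark the most conflicted weights for splitting. The function names (`disagreement_scores`, `select_weights_to_split`) and the normalization are illustrative assumptions, not the paper's definitions.

```python
import numpy as np

def disagreement_scores(task_grads):
    """Per-weight disagreement across tasks.

    task_grads: array of shape (num_tasks, num_weights), where row t
    holds the policy-gradient estimate for task t.

    Hypothetical metric: for each weight, the variance of its gradient
    across tasks, normalized by the mean squared gradient magnitude.
    High values mean the tasks pull the weight in conflicting directions.
    """
    mean = task_grads.mean(axis=0)
    var = ((task_grads - mean) ** 2).mean(axis=0)
    return var / (np.square(task_grads).mean(axis=0) + 1e-8)

def select_weights_to_split(task_grads, fraction=0.2):
    """Indices of the top `fraction` most-disagreeing weights."""
    scores = disagreement_scores(task_grads)
    k = max(1, int(fraction * scores.size))
    return np.argsort(scores)[-k:]

# Toy example: 3 tasks, 6 shared weights. Weights 4 and 5 receive
# opposing gradients across tasks, so they rank highest for splitting.
grads = np.array([
    [0.5, 0.5, 0.1, 0.2,  1.0, -1.0],
    [0.5, 0.4, 0.1, 0.2, -1.0,  1.0],
    [0.5, 0.6, 0.1, 0.2,  1.0, -1.0],
])
split_idx = select_weights_to_split(grads, fraction=1/3)
print(sorted(split_idx.tolist()))  # → [4, 5]
```

Weights whose gradients agree across tasks stay shared; only the conflicted subset is duplicated per task, which is what lets the specialized copies move in different directions without fighting each other.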


Similar resources

DiGrad: Multi-Task Reinforcement Learning with Shared Actions

Most reinforcement learning algorithms are inefficient for learning multiple tasks in complex robotic systems, where different tasks share a set of actions. In such environments a compound policy may be learnt with shared neural network parameters, which performs multiple tasks concurrently. However such compound policy may get biased towards a task or the gradients from different tasks negate ...
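The gradient negation this snippet describes is easy to see directly: when two tasks push a shared weight in opposite directions, the naive summed update cancels. A minimal sketch with illustrative numbers (not DiGrad's actual update rule):

```python
import numpy as np

# Per-task gradients for three shared weights (hypothetical values).
grad_task_a = np.array([ 0.8, 0.3, 0.5])
grad_task_b = np.array([-0.8, 0.2, 0.4])

# Naive compound update: the gradients on the first weight cancel, so
# a jointly trained policy makes no progress on that weight even though
# each task individually wants to change it.
combined = grad_task_a + grad_task_b
print(combined)  # first component is ~0 while the others add up
```

This cancellation on shared parameters is the failure mode that both DiGrad's shared-action formulation and the gradient-guided splitting above aim to avoid.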


Online Multi-Task Learning for Policy Gradient Methods

Policy gradient algorithms have shown considerable recent success in solving high-dimensional sequential decision making tasks, particularly in robotics. However, these methods often require extensive experience in a domain to achieve high performance. To make agents more sample-efficient, we developed a multi-task policy gradient method to learn decision making tasks consecutively, transferring...


Learning Task Allocation via Multi-level Policy Gradient Algorithm with Dynamic Learning Rate

Task allocation is the process of assigning tasks to appropriate resources. To achieve scalability, it is common to use a network of agents (also called mediators) that handles task allocation. This work proposes a novel multi-level policy gradient algorithm to solve the local decision problem at each mediator agent. The higher level policy stochastically chooses a task decomposition. The lower...


Robotic Search & Rescue via Online Multi-task Reinforcement Learning

Reinforcement learning (RL) is a general and well-known method that a robot can use to learn an optimal control policy to solve a particular task. We would like to build a versatile robot that can learn multiple tasks, but using RL for each of them would be prohibitively expensive in terms of both time and wear-and-tear on the robot. To remedy this problem, we use the Policy Gradient Efficient ...


Graph-Structured Multi-task Regression and an Efficient Optimization Method for General Fused Lasso

We consider the problem of learning a structured multi-task regression, where the output consists of multiple responses that are related by a graph and the correlated response variables are dependent on the common inputs in a sparse but synergistic manner. Previous methods such as l1/l2-regularized multi-task regression assume that all of the output variables are equally related to the inputs, ...



Journal:
  • CoRR

Volume abs/1709.07979  Issue 

Pages  -

Publication date 2017